219 research outputs found

    A Mathematical Measurement For Korean Text Mining and Its Application

    Department of Mathematical Sciences. In modern society we are buried beneath an overwhelming amount of text data on the internet. We are less inclined to just surf the web and pass the time. To solve this problem, especially to grasp part and parcel of the text data we are presented, there have been numerous studies on the relationship between text data and the ease of the perception of the text's meaning. However, most of the studies focused on English text data. Since most research did not take into account linguistic characteristics, these same methods are not suitable for Korean text. A special method is required to analyze Korean text data utilizing the characteristics of Korean. Thus we propose a new framework for Korean text mining in various texts via proper mathematical measurements. The framework consists of three parts: 1) text summarization, 2) text clustering, and 3) relational text learning. Text summarization is the method of extracting the essential sentences from the text. As a measure of importance, we propose specific formulas which focus on the characteristics of Korean. These formulas provide the input features for the fuzzy summarization system. However, this method has a significant defect for large data sets: the number of summarized sentences increases with the word count of a particular text. To solve this, we propose using text clustering. This field has been studied for a long time. It has a tradeoff of accuracy for speed. Considering the syllable features of Asian linguistics, we have designed "Syllable Vector" as a new measurement. It has shown remarkable performance when implemented with text clustering, especially in accuracy and speed, by effectively reducing dimensions. Thirdly, we considered the relational feature of text data. The above concepts deal with the document itself. That is, they treat documents as independent of one another. To handle the relations between documents, we designed a new architecture for text learning using neural networks (NN). Recently, the most remarkable work in natural language processing (NLP) is "word2vec", which is built with artificial neural networks. Our proposed model has a learning structure of bipartite layers using meta information between text data, with a focus on citation relationships. This structure reflects the latent topic of the text using the quoted information. It can solve the shortcomings of the conventional system based on the term-document matrix.

    GenHPF: General Healthcare Predictive Framework with Multi-task Multi-source Learning

    Despite the remarkable progress in the development of predictive models for healthcare, applying these algorithms on a large scale has been challenging. Algorithms trained on a particular task, based on specific data formats available in a set of medical records, tend to not generalize well to other tasks or databases in which the data fields may differ. To address this challenge, we propose General Healthcare Predictive Framework (GenHPF), which is applicable to any EHR with minimal preprocessing for multiple prediction tasks. GenHPF resolves heterogeneity in medical codes and schemas by converting EHRs into a hierarchical textual representation while incorporating as many features as possible. To evaluate the efficacy of GenHPF, we conduct multi-task learning experiments with single-source and multi-source settings, on three publicly available EHR datasets with different schemas for 12 clinically meaningful prediction tasks. Our framework significantly outperforms baseline models that utilize domain knowledge in multi-source learning, improving average AUROC by 1.2%P in pooled learning and 2.6%P in transfer learning while also showing comparable results when trained on a single EHR dataset. Furthermore, we demonstrate that self-supervised pretraining using multi-source datasets is effective when combined with GenHPF, resulting in a 0.6%P AUROC improvement compared to models without pretraining. By eliminating the need for preprocessing and feature engineering, we believe that this work offers a solid framework for multi-task and multi-source learning that can be leveraged to speed up the scaling and usage of predictive algorithms in healthcare. Comment: Accepted by IEEE Journal of Biomedical and Health Informatics
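As a rough illustration of what "converting EHRs into a hierarchical textual representation" could look like, the sketch below flattens each event of a patient record into a plain string of table name, column names, and values. This is only a minimal sketch under assumptions: the function names, the example tables and columns, and the exact text layout are hypothetical and are not taken from the GenHPF paper or its code.

```python
from typing import Any, Iterable, List, Mapping, Tuple


def event_to_text(table: str, row: Mapping[str, Any]) -> str:
    """Flatten one event into 'table column value column value ...'.

    Hypothetical layout: the schema (table and column names) is kept as
    plain words, so no dataset-specific code mapping is needed.
    """
    parts = [table]
    for column, value in row.items():
        if value is None:  # skip missing fields instead of emitting 'None'
            continue
        parts.append(str(column))
        parts.append(str(value))
    return " ".join(parts)


def patient_to_text(events: Iterable[Tuple[str, Mapping[str, Any]]]) -> List[str]:
    """Patient level of the hierarchy: one text string per event, in order."""
    return [event_to_text(table, row) for table, row in events]


# Made-up example events (not from any real dataset).
events = [
    ("labevents", {"itemid": "Glucose", "value": 95, "uom": "mg/dL"}),
    ("prescriptions", {"drug": "Insulin", "dose": 2, "route": None}),
]
texts = patient_to_text(events)
for line in texts:
    print(line)
```

Because the column names travel with the values, two datasets with different schemas produce strings of the same kind, which is the property the abstract attributes to the textual representation.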

    Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation

    As an emerging field in Machine Learning, Explainable AI (XAI) has been offering remarkable performance in interpreting the decisions made by Convolutional Neural Networks (CNNs). To achieve visual explanations for CNNs, methods based on class activation mapping and randomized input sampling have gained great popularity. However, the attribution methods based on these techniques provide lower resolution and blurry explanation maps that limit their explanation power. To circumvent this issue, visualization based on various layers is sought. In this work, we collect visualization maps from multiple layers of the model based on an attribution-based input sampling technique and aggregate them to reach a fine-grained and complete explanation. We also propose a layer selection strategy that applies to the whole family of CNN-based models, based on which our extraction framework is applied to visualize the last layers of each convolutional block of the model. Moreover, we perform an empirical analysis of the efficacy of derived lower-level information to enhance the represented attributions. Comprehensive experiments conducted on shallow and deep models trained on natural and industrial datasets, using both ground-truth and model-truth based evaluation metrics validate our proposed algorithm by meeting or outperforming the state-of-the-art methods in terms of explanation ability and visual quality, demonstrating that our method shows stability regardless of the size of objects or instances to be explained. Comment: 9 pages, 9 figures, Accepted at the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

    Homogeneous bilayer graphene film based flexible transparent conductor

    Graphene is considered a promising candidate to replace conventional transparent conductors due to its low opacity, high carrier mobility and flexible structure. Multi-layer graphene or stacked single layer graphenes have been investigated in the past but both have their drawbacks. The uniformity of multi-layer graphene is still questionable, and single layer graphene stacks require many transfer processes to achieve sufficiently low sheet resistance. In this work, bilayer graphene film grown with low pressure chemical vapor deposition was used as a transparent conductor for the first time. The technique was demonstrated to be highly efficient in fabricating a conductive and uniform transparent conductor compared to multi-layer or single layer graphene. Four transfers of bilayer graphene yielded a transparent conducting film with a sheet resistance of 180 Ω/□ at a transmittance of 83%. In addition, bilayer graphene films transferred onto plastic substrate showed remarkable robustness against bending, with sheet resistance change less than 15% at 2.14% strain, a 20-fold improvement over commercial indium oxide films. Comment: Published in Nanoscale, Nov. 2011 : http://www.rsc.org/nanoscal

    A semi-analytic method with an effect of memory for solving fractional differential equations

    In this paper, we propose a new modification of the multistage generalized differential transform method (MsGDTM) for solving fractional differential equations. In MsGDTM, the key to obtaining an accurate approximate solution is how the initial condition is imposed in each sub-domain. In several works (Odibat et al. in Comput. Math. Appl. 59:1462-1472, 2010; Alomari in Comput. Math. Appl. 61:2528-2534, 2011; Gokdoğan et al. in Math. Comput. Model. 54:2132-2138, 2011), the authors updated the initial condition in each sub-domain by using the approximate solution from the previous sub-domain. However, we point out that this approach makes it hard to account for the memory effect, which is a basic property of fractional differential equations. Here we provide a new algorithm that imposes the initial conditions by using the integral operator, which enhances accuracy. Several illustrative examples are presented, and it is shown that the proposed technique is robust and accurate for solving fractional differential equations.
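A brief illustration of the memory effect mentioned in this abstract, assuming the Caputo definition of the fractional derivative (the abstract does not say which definition the paper uses). The breakpoint $t_k$ below is a hypothetical sub-domain boundary introduced only for this illustration.

```latex
% Caputo fractional derivative of order \alpha, with n-1 < \alpha < n:
D^{\alpha} y(t) = \frac{1}{\Gamma(n-\alpha)}
    \int_{0}^{t} (t-s)^{\,n-\alpha-1}\, y^{(n)}(s)\, ds .

% Splitting the integral at a sub-domain boundary t_k < t:
D^{\alpha} y(t) = \frac{1}{\Gamma(n-\alpha)}
    \left[ \int_{0}^{t_k} (t-s)^{\,n-\alpha-1}\, y^{(n)}(s)\, ds
         + \int_{t_k}^{t} (t-s)^{\,n-\alpha-1}\, y^{(n)}(s)\, ds \right] .
```

The first integral depends on the whole history of $y$ on $[0, t_k]$. A scheme that restarts in each sub-domain from the value $y(t_k)$ alone keeps only the second integral, which is one way to see why carrying over a single initial value does not capture the memory of the equation.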